In recent years, so-called "effectiveness studies," also
called "real-world studies" or "pragmatic trials," have 
gained increasing importance in the context of evidence- 
based medicine. These studies follow less restrictive 
methodological standards than phase III studies in terms 
of patient selection, comedication, and other design 
issues, and their results should therefore be better gener- 
alizable than those of phase III trials. Effectiveness studies, 
like other types of phase IV studies, can therefore con- 
tribute to knowledge about medications and supply rele- 
vant information in addition to that gained from phase III 
trials. However, the less restrictive design and inherent 
methodological problems of phase IV studies have to be 
carefully considered. For example, the greater variance 
caused by the different kinds of confounders as well as 
problematic design issues, such as insensitive primary out- 
come criteria, unblinded treatment conditions, inclusion 
of chronic refractory patients, etc, can lead to wrong
conclusions. Due to these methodological problems,
effectiveness studies are in principle at a lower level of
evidence, adding only a complementary view to the results
of phase III trials without falsifying them.

In the context of evidence-based medicine, 1
randomized control-group trials (RCTs) are considered 
to be the decisive level of scientifically proven evidence 
as far as therapeutic aspects are concerned. 2 Placebo- 
controlled trials, especially for certain psychiatric indi- 
cations, are ranked higher in terms of evidence than 
active control-group studies. 3 Especially in terms of 
licensing perspectives, there is a demand from the 
European Medicines Agency and the Food and Drug 
Administration to demonstrate efficacy based on RCTs 
including a placebo control group for obvious method- 
ological reasons. The knowledge gained from noninter- 
ventional (observational) studies (NIS) as well as from 
single-case studies is only seen as being relevant when it 
is an addition to such studies or a replacement in
indications where empirical studies of a higher
methodological standard are lacking. This view corresponds
to the general methodological understanding of empirical
research. Evidence grading reflects the fact that, for
methodological reasons, certain study designs yield
results that are more likely to be reliable, in line with
the rules of the methodology of empirical research. 4,5
Thus, randomized control-group studies have a higher
value than nonrandomized or uncontrolled studies.

Do effectiveness studies tell us the truth?

There is a general consensus that the results of phase III 
studies are not fully generalizable: they have a high 
internal validity but insufficient external validity. One of 
the reasons for this is the strict selection of patients 
according to various clinically relevant characteristics 
such as the exclusion of suicidality, comorbidity, etc. For 
this reason it has long been a tradition within clinical 
psychopharmacology to complement the phase III trial 
results with ones more strongly oriented towards every- 
day clinical practice and conditions, ie, studies in patients 
who better represent the "average" patients and treated 
under conditions as close as possible to "routine" care, 
eg, phase IV studies (Figure 1). However, it has always
been stressed that, because of their many inherent
methodological problems, eg, biases due to the lack of
double-blind conditions or of any blinding, phase IV
studies such as naturalistic observational studies (NIS)
only deliver complementary knowledge and cannot falsify
the results of phase III studies. 6

However, this strict rule can be relaxed if the phase
IV studies are performed, like phase III studies, as
randomized control-group studies, whether unblinded or
even under blind or double-blind conditions.
Some experts seem inclined to attach greater importance
to the results of these studies than to those of the
methodologically stricter phase III studies. 7 This might
in particular result from criticism of the increasingly
common practice, especially in the USA, of including in
phase III studies not "real" patients from care
settings, but suitable persons found through
advertisements. Of course, rather than this questionable
approach, properly performed phase III studies in "real"
patients should be advocated. Even so, some experts
judge the "real-world approach" of effectiveness studies
to be more valuable than phase III trials, at least in terms
of clinical relevance.

Some methodological considerations on 
effectiveness studies 

Effectiveness studies are intended to fill the gap between 
methodologically rigorous RCTs in the sense of phase 
III trials and naturalistic observational studies. As such, 
they are hybrids of the RCT methodology and natural- 
istic designs and are therefore termed "practical clinical 
trials." 8 They are intentionally designed to evaluate the 
effectiveness of the treatments under real-world condi- 
tions and in patient samples representative of everyday 
clinical practice (Table I). They can be performed as 
RCTs, but less demanding designs are also possible. If
they even use a blind 9 or double-blind 10 RCT approach,
they come close to phase III trials in terms of design,
the only difference being that patient selection is less
restrictive and that, eg, comorbidity or comedication
is allowed.

Figure 1. The 4-phase model of clinical psychopharmacology. RCT, randomized controlled trial

Clinical trials of "effectiveness"
  More relaxed exclusion criteria, permitting wider range of:
    Patients (eg, comorbidity not excluded)
    Treatment settings and interventions (including adjunctive treatments)
  Emphasis on clinical need to determine treatment doses, etc
  Levels and/or types of psychopathology
  Forms of outcome criteria, such as:
    Time to discontinuation
    Quality of life
  Preference of self-rating instruments or global ratings
  Advantages
    Higher external validity
    Arguably greater applicability to "real-world" practice settings
    Capacity to inform policy process
    Longer duration can be achieved more easily
    Can enrol large numbers of patients more easily
  Disadvantages
    Internal validity limited
    Cannot be used to examine effective dose ranges
    Cannot make as meaningful clinical comparisons between agents

Clinical trials of "efficacy"
  Highly restricted inclusion criteria to reduce confounding biases
  Randomization and blinding, also to reduce bias
  Treatment driven exclusively by study protocol
  Patients remain only in the treatment group originally assigned
  Fewer treatment adjustments are allowed
  Strict limitations on adjunctive treatment
  Measures taken to ensure all members of treatment group receive same intervention(s)
  Use of well-validated outcome assessment
  Advantages
    Higher internal validity for clinical effects
    Higher internal validity for adverse effects, tolerability
    Contextual and human factors controlled for
    Considered "best quality" clinical evidence for informing treatment decisions
  Disadvantages
    Stringent inclusion criteria limit external validity
    Outcome measures may not reflect crucial advantages and limitations of interventions being studied
    Outcome measures may not address issues most important to patients and families
    Often short in duration

Table I. Some characteristics of clinical trials of "efficacy" vs trials of
"effectiveness."

In order to avoid guidelines completely losing their
relationship with clinical reality by preferring study types
with too little generalizability, greater emphasis should 
be placed on other empirical research approaches. A 
drug that has been evaluated in placebo-controlled stud- 
ies with the selection problems described above should 
also be tested in studies with less restrictive methodol- 
ogy, eg, randomized control-group studies versus a stan- 
dard drug; the results should at least show a tendency 
towards consistency. The 3-arm study design recom- 
mended by the European regulatory authority, 
EMEA/CPMP, 11 in which the experimental substance is 
compared with placebo and a standard drug, delivers 
more meaningful results but cannot avoid the problems 
associated with the extensive selection of patients since 
it still has a placebo group. Therefore, other types of 
studies traditionally considered to be phase IV should 
be part of the evaluation process. 
It should be remembered that, traditionally, there was a 
demand for a psychopharmaceutical drug to be clini- 
cally evaluated in a phase model at various method- 
ological levels of empirical research and with 
approaches of different methodological stringency. This 
means that evidence for efficacy and tolerability should 
additionally be obtained from phase IV studies, which 
are more closely oriented towards routine clinical 
care, 12-17 to complement the results of phase III studies
with their strict methodology. In such a phase model of 
clinical/pharmacological evaluation, the evidence from 
each phase is seen to be complementary and part of the 
overall evidence. This idea can no longer be found in the 
systems currently used in guidelines to assess evidence, 
since evidence is rated according to the study design 
with the most demanding methodology for the respec- 
tive therapy (eg, placebo-controlled studies) without 
ascertaining whether consistent results are available 
from less restrictive but more generalizable study types. 
A future grading of evidence that is more relevant for
clinical reality should assess whether results are
available from studies with both high internal (eg,
control-group studies) and high external (eg, effectiveness
studies, observational studies) validity and whether the
results are in principle congruent. In this respect, the
current interest in effectiveness studies is in principle
a positive development. 10,18,19 However, the results of
these effectiveness studies should not be overinterpreted,
given their fundamental methodological limitations (as
demonstrated, eg, for the Clinical Antipsychotic Trials of
Intervention Effectiveness [CATIE] trial). 6
The inclusion of "confounders" (from the perspective of
a phase III trial) such as comorbidity or comedication
increases the variance and results in a reduced
signal-to-noise ratio, which makes it more difficult to find
differences between two groups (β error problem), even if
these factors are adequately considered in the statistical
analysis. Without placebo conditions it might sometimes
even be difficult to judge whether there is a real drug
effect, especially if the pre-post difference is
unexpectedly low and if there are no differences between
two active comparators. Given that these pragmatic
trials mostly compare two active compounds, it should
be accepted, on the basis of the traditional methodology
of clinical psychopharmacological trials, that only proof
of superiority in the statistical sense counts, while the
failure to demonstrate a statistically significant
difference cannot be interpreted as showing that both
treatments are comparable. 3 The latter conclusion is not
permissible for fundamental methodological reasons.
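
The link between added variance and the β error described above can be made concrete with a small calculation. The following is a minimal sketch and is not taken from the source: it assumes a two-sided, two-sample z-test with equal arm sizes and a common standard deviation, and all numbers (a true difference of 4 scale points, 100 patients per arm, the three SD values) are hypothetical.

```python
from statistics import NormalDist


def power_two_sample(delta, sd, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided two-sample z-test.

    Assumes equal arm sizes and a common, known standard deviation
    (a simplification chosen for illustration only).
    """
    z = NormalDist()
    se = sd * (2.0 / n_per_arm) ** 0.5      # standard error of the difference
    z_crit = z.inv_cdf(1.0 - alpha / 2.0)   # two-sided critical value
    shift = delta / se                      # true difference in SE units
    # Probability of rejecting in either tail when the true difference is delta
    return (1.0 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


# Hypothetical scenario: same true difference, increasing outcome variance
for sd in (8.0, 12.0, 16.0):
    power = power_two_sample(delta=4.0, sd=sd, n_per_arm=100)
    print(f"SD={sd:>4}: power={power:.2f}, beta={1 - power:.2f}")
```

Under these assumptions, power falls and β rises as the standard deviation grows while the true difference and the sample size stay fixed, which is the signal-to-noise problem in numerical form.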
A different statistical design is required to demonstrate
equivalence: the so-called equivalence design. However,
this methodological approach is also far from having the
unambiguity of superiority trials. For example, without a
placebo control, the absence of which is characteristic of
effectiveness studies, 20-23 one cannot be sure that the
active drugs are being compared in a drug-sensitive sample
(Table II). 3 The worst-case scenario is that the drugs show
no outcome difference because they are not effective at all
in the respective sample. This is not as unlikely as some
might believe. In the field of antidepressants, failed
studies (in the sense that, in a 3-arm study comparing an
experimental drug with a standard comparator and
placebo, not even the standard comparator [internal
validator] differs from placebo) are quite common. 24 In
recent years there has even been an increasing number 
of failed studies, especially in the United States, not only 
in the field of antidepressants but also in the field of 
antipsychotics, although the antipsychotics generally 
have a larger effect size than antidepressants. Several 
factors are relevant in this context, such as low inter- 
rater reliability, especially in huge multicenter trials, 
inclusion of less responsive patients, more chronic 
patients with residual symptomatology or comorbid 
patients, no restriction of permitted comedications, etc. 
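
The distinction drawn above between "no significant difference" and demonstrated equivalence can be illustrated with a confidence-interval check. This is a hedged sketch only: the source does not specify a procedure, so the example assumes the commonly used two one-sided tests (TOST) logic under a normal approximation, and the estimates, standard errors, and the equivalence margin of 3 points are hypothetical.

```python
from statistics import NormalDist


def confidence_interval(diff, se, level):
    """Two-sided normal-approximation CI for a difference between arms."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return diff - z * se, diff + z * se


def shows_superiority(diff, se, alpha=0.05):
    """Superiority: the (1 - alpha) CI excludes zero."""
    low, high = confidence_interval(diff, se, 1.0 - alpha)
    return low > 0.0 or high < 0.0


def shows_equivalence(diff, se, margin, alpha=0.05):
    """TOST logic: the (1 - 2*alpha) CI lies entirely inside (-margin, +margin)."""
    low, high = confidence_interval(diff, se, 1.0 - 2.0 * alpha)
    return -margin < low and high < margin


# Hypothetical trial A: imprecise estimate (large standard error)
print(shows_superiority(1.5, 2.0), shows_equivalence(1.5, 2.0, margin=3.0))
# Hypothetical trial B: precise estimate close to zero
print(shows_superiority(0.5, 1.0), shows_equivalence(0.5, 1.0, margin=3.0))
```

In this sketch, trial A shows neither superiority nor equivalence, which is the situation the text warns about: a nonsignificant result from a noisy comparison says nothing about comparability. Even when equivalence is shown, as in trial B, it does not establish that either drug was effective in the sample.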
In discussing methodological aspects of effectiveness 
studies it should be questioned whether outcome crite- 
ria such as "nondiscontinuation," or similar categorical 
end points like "level of caring," preferably applied in 
some effectiveness studies, really are ideal outcome cri- 
teria, given the fact that they can easily be influenced by 
the investigators (who may be biased by their expecta- 
tions if they are not blinded) and are of poorer psycho- 
metric value than dimensional ones. 
It can be generally questioned whether "nondiscontinu- 
ation" really reflects only efficacy and tolerability 
aspects, or whether other parameters beyond drug 
effects are also involved, eg, confidence in the thera- 
peutic concept. For example, therapeutic concepts like 
psychotherapy, herbal drug therapy, etc, might be more 
acceptable to a subgroup of patients, although they may 
have a lower level of efficacy. Different aspects of toler- 
ability can have different effects on discontinuation, 
depending on the specific tolerability problems and on 
the time patterns of side effects. Thus, one can presume
that severe extrapyramidal symptoms occurring right at
the start of a study result in an early dropout, the slow
development of weight gain in a rather later dropout, and
tardive dyskinesia (TD) or, in most cases, even metabolic
disorder in a much later dropout. This means that a rough
measurement like "discontinuation" or "time to
discontinuation" is inherently biased with respect
to the individual antipsychotics being evaluated. This
becomes even worse if the transition from the pretreatment
antipsychotic to the study antipsychotic is taken
into consideration, in particular if it is direct, without a
sufficiently long washout phase. Depending on the
pharmacological profile of the respective pretreatment drug,
for example in terms of D2 potency, anticholinergic or
antihistaminergic properties, and the related
pharmacological profile of the study drug, several problems
can appear immediately after transition. 25 These can include
reduced antipsychotic efficacy, discontinuation symptoms,
hangover of side effects wrongly attributed to the
study drug, pharmacodynamic interactions in terms of
oversedation, histaminergic or cholinergic rebound
phenomena, etc. Thus, there are good and bad combinations
of drugs for this transition process. Theoretically, the best
transition is one in which the pretreatment and the study
drug are identical. There are also other critical issues that
need to be considered in this context. 26,27

Placebo-controlled studies
  Advantages
    Allow estimation of the assay sensitivity and thus internal validation of the study
    Allow better evaluation of the clinical relevance
    Smaller sample size
    Lower study costs
  Disadvantages
    Perhaps higher risk from "nontreatment"
    Perhaps more limited generalizability of the results to the general population

Studies with an active control
  Advantages
    Supply data on relative efficacy and tolerability
    At least theoretically no inactive treatment
    Fewer dropouts due to lack of efficacy
    May be more acceptable to an ethics commission
  Disadvantages
    Risk of false studies because assay sensitivity is lacking
    Equivalence/noninferiority not suitable as proof of efficacy
    Active comparator may not be standard therapy
    More dropouts due to adverse events
    Tendency to minimize efficacy differences
    Larger sample sizes
    Higher study costs

Table II. Advantages and disadvantages of using an active control or placebo in clinical studies.

Quality of life 

Another preferred measure of global outcome used as a
primary outcome criterion in some effectiveness studies
is "quality of life." There is no doubt that this is an
important outcome criterion which reflects the subjective
dimension of the patient's experience. 28-30 The classical
approach in quality of life research assesses quality
of life using a self-rating scale in order to guarantee the
subjective perspective. The SF-36 31,32 is particularly widely
used in psychiatry as well as in other fields of medicine,
but there are also several other scales to assess this
dimension. 33-35 This leads to the general problem of
self-rating approaches for the assessment of the primary
outcome if they are not complemented by an observer-rating
approach. For example, the Sequenced Treatment
Alternatives to Relieve Depression (STAR*D) study 18
relies largely on self-rating results to assess outcome in
terms of depression severity. 9

Generally, there are pros and cons for the use of
self-rating scales. They give a complementary view to the
observer rating of the same construct/dimension. 36,37 The
correlation between the observer ratings and self-ratings
might not be high and may be quite changeable, depending on
the psychopathological state in terms of severity and type
of symptoms. 38 It is often unclear exactly what self-ratings
of quality of life reflect: severity of the
psychopathological state in the global sense, certain
dimensions of the psychopathological state (eg, depression),
current mood more than real depressive symptoms, side
effects of drugs, or the psychosocial situation. 29,39-43 If
such a scale is used as the primary outcome criterion of a
study, it is doubtful whether it is sensitive enough to
detect intergroup differences in treatment-induced changes,
given the high variance of self-ratings in general and of
self-ratings of quality of life in particular. For example,
not many of the studies on antipsychotics that used a
quality of life scale as a secondary outcome criterion found
significant intergroup differences. 2929 Thus, the use of a
quality of life scale carries a high risk of not finding
significant differences between two drugs, especially if
both are active drugs.
Do effectiveness studies generally fulfil their claim of
treating less selected samples of patients than phase III
studies? At least some apparently do not. For example,
in the effectiveness study comparing olanzapine and
haloperidol in the treatment of schizophrenia, 44 only 309
of the 4386 patients assessed for eligibility were
included in the study (7.0%). This rate is even somewhat
lower than the usual rate of 10% to 15% in phase III
studies. 45 Some effectiveness studies appear to select
patients differently from phase III trials.
Often, patients with milder and more chronic symptoms 
may be selected than is the case in phase III studies, thus 
making it more difficult per se to demonstrate drug 
effects and in particular differences between drug effects, 
because a relevant subgroup of patients might be par- 
tially unresponsive to a drug. The data from the Cost 
Utility of the Latest Antipsychotics in Severe 
Schizophrenia (CUtLASS) study serve as an example 
here. In this study, the pre-post changes in the Positive 
And Negative Symptom Scale (PANSS) positive score 
after 52 weeks amounted to only 2.0 in the first-gener- 
ation antipsychotic (FGA) arm and 1.5 in the second- 
generation antipsychotic (SGA) arm; these changes are 
extremely low, even when one takes into account that 
this study was not an acute treatment study but rather a 
switch study in partially improved/stabilized patients. 
CATIE 46 and STAR*D 47 patients also seem to lie more toward
the chronic, and even partially refractory, end of the spectrum.
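
The inclusion rate quoted earlier in this paragraph for the olanzapine versus haloperidol study can be checked with a one-line calculation. The sketch below uses only the two figures given in the text (309 included, 4386 assessed); the function name is a hypothetical helper introduced for illustration.

```python
def inclusion_rate(included, assessed):
    """Fraction of patients assessed for eligibility who were enrolled."""
    if assessed <= 0:
        raise ValueError("assessed must be positive")
    return included / assessed


rate = inclusion_rate(309, 4386)  # figures quoted in the text
print(f"{rate:.1%}")              # inclusion rate as a percentage
print(rate < 0.10)                # True if below the 10% to 15% range cited for phase III
```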
In order to understand some of the methodological
problems of "effectiveness" studies in more detail, the
review by Möller on effectiveness studies in
the field of antipsychotics 6 should be taken into
consideration. It is interesting that some of these studies
were published in high-ranking journals, although some of
them have considerable methodological shortcomings,
which mean that the conclusions drawn are not tenable,
especially not when they are used to falsify the results of
phase III studies. Most of these studies arrived at the
result that SGAs were generally not superior to FGAs
and are thus open to the objection that failure to prove
superiority does not mean equivalence. The EUFEST
study was the only one able to demonstrate superiority of
SGAs vs haloperidol. A finding of superiority is, for
fundamental methodological reasons (see above), more valid,
especially when considering the increased number of
confounders in effectiveness studies, than the finding of
no statistical difference, which is always difficult to
interpret.

The CATIE study 

The most famous of the effectiveness studies on antipsychotics
is the CATIE study. 10 There is no doubt that the
CATIE study is an important study when one considers, 
for example, the large sample size (N=1493 in 57 cen- 
ters), the complex design with several parallel treatment 
arms, the 18-month duration of treatment of the first 
phase, inclusion of sequential treatment phases, etc 
(phase 1 of the study was published in 2005 10 ). Also, the 
double-blind conditions of this study and the sophisti- 
cated and comprehensive statistical analysis of the 
extensive database are appealing. The study has received 
a lot of publicity, particularly in the general press, where 
it was portrayed as showing that SGAs are for the most 
part not better, but much more expensive, than FGAs. 
This conclusion is not tenable because of the methodological
failings described above and elsewhere. 6,48,49
However, to end on a more positive note, many other 
results not only from phase 1 but also phase 2 and 3 are 
of relevance for clinicians, eg, on different side-effect 
patterns of individual SGAs, on metabolic issues, on 
meaningful sequences of antipsychotic treatment in case 
of partial nonresponse, on the unique efficacy of cloza- 
pine in refractory patients, etc. 46,50 
In the field of antidepressants there are fewer
effectiveness studies. One example is the "Texas
Algorithm Study," which tried to demonstrate the
superiority of the algorithm approach in treating depressive
patients by comparing treatment outcome of depressive
patients from two different hospitals. The outcome was
more advantageous in the hospital where the algorithm
had been applied. However, the weakness of this study
was the baseline differences between the two samples,
indicating that the patients in the algorithm sample probably
had a more positive prognosis. Two other studies, which
evaluated the algorithm approach in a "real-world" RCT,
could confirm the superiority of the treatment
strategy. 51,52

The most famous effectiveness study in the field of
depression treatment is the STAR*D study. 53 Even more
than the CATIE study, this study was a gigantic
endeavor in terms of sample size, complexity of design,
etc. It investigated, under unblinded conditions, two
different sequential treatment approaches in depressive
outpatients, who were randomized at baseline to two
different groups. At each level of the complex treatment
algorithm the outcome differences between the different
groups were evaluated. The methodological problems of
this study include the low Hamilton Depression Rating
Scale (HAMD) inclusion criterion (HAMD >14), the
recruitment of more or less chronic patients in poor
psychosocial conditions, and overly optimistic power
calculations, with the consequence that, at the latest for
levels 3 and 4, the study did not have the necessary power
to detect clinically relevant differences. None of the
different drug treatment approaches on each level of the
sequential treatment algorithm was statistically superior
to any of the others; at most some showed a numerical
degree of superiority. This "real-world" study reached no
clear efficacy results due to inherent methodological
problems. From a statistical point of view it is also
problematic that, eg, the STAR*D study data were used to
generate about 100 publications answering different
questions, each of which reports results based on
multiple testing. Given all these problems, it has to be
questioned whether many really clinically relevant
conclusions can be drawn from this study.
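
The multiple-testing concern raised above can be quantified with the standard familywise error formula. This is an illustrative sketch, not an analysis of the STAR*D data: it assumes independent tests of true null hypotheses at a nominal level of 0.05, and the test counts are hypothetical stand-ins (the figure of 100 merely echoes the approximate number of publications mentioned in the text, each of which may itself contain several tests).

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """P(at least one false positive) across n independent tests of true nulls."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_alpha(n_tests, alpha=0.05):
    """Per-test threshold that keeps the familywise rate at or below alpha."""
    return alpha / n_tests


# Hypothetical numbers of tests run on the same database
for n in (1, 10, 100):
    print(n, round(familywise_error_rate(n), 3), bonferroni_alpha(n))
```

Under the independence assumption, the chance of at least one spurious "significant" finding approaches certainty as the number of tests reaches 100, unless the per-test threshold is tightened, for example with the Bonferroni correction shown here.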
Of special methodological interest is the finding that the 
outcome difference between an a posteriori defined effi- 
cacy sample and an effectiveness sample was not as huge 
as hypothesized. 54 This finding was supported by the 
results of a naturalistic study on about 1000 depressive 
inpatients where a similar approach of subdividing the 
sample a posteriori had been applied. 55 These findings 
underline that although there are differences in the
sample characteristics of phase III trials and "real-world"
trials, 56 their relevance for outcome is not necessarily
as great as anticipated. Thus, phase III studies are
apparently more than only "proof of concept" studies, 
but have some, although limited, generalizability for 
real-world patients. 

Summary and conclusions 

Effectiveness studies can contribute to our knowledge 
about the use and effectiveness of medications. They 
help to understand that even novel/expensive drugs have 
their limitations and that it may not be possible to 
demonstrate consistently their hypothesized superiority 
in terms of efficacy, safety, compliance, quality of life, etc 
under "real-world" conditions in chronic, partially refrac- 
tory, or comorbid patients. In general they can also sup- 
ply interesting data on dosing issues, sequences of drugs 
in case of partial response and side-effect patterns. 
Altogether, the effectiveness studies seem to have a lot 